Skip to content

Add tiered-storage module with stored fields prefetch support#20962

Open
GeekGlider wants to merge 1 commit intoopensearch-project:mainfrom
GeekGlider:feature/tiered-storage-module
Open

Add tiered-storage module with stored fields prefetch support#20962
GeekGlider wants to merge 1 commit intoopensearch-project:mainfrom
GeekGlider:feature/tiered-storage-module

Conversation

@GeekGlider
Copy link

@GeekGlider GeekGlider commented Mar 23, 2026

Description

Adds the tiered-storage module to support writable warm index features. This module introduces stored fields prefetching for tiered storage indices, which prefetches stored fields during the fetch phase of search to improve read performance on warm indices.

The module includes:

  • TieredStoragePlugin — the main plugin class that registers cluster settings and a search operation listener
  • StoredFieldsPrefetch — a SearchOperationListener that prefetches stored fields when the feature flag is enabled
  • TieredStoragePrefetchSettings — cluster-level settings to control prefetch behavior (read-ahead block count and enable/disable toggle)
  • Unit tests for prefetch settings and the stored fields prefetch listener

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@GeekGlider GeekGlider requested a review from a team as a code owner March 23, 2026 09:23
@github-actions
Copy link
Contributor

github-actions bot commented Mar 23, 2026

PR Reviewer Guide 🔍

(Review updated until commit f0959d8)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add TieredStoragePrefetchSettings cluster settings

Relevant files:

  • modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/TieredStoragePrefetchSettings.java
  • modules/tiered-storage/src/test/java/org/opensearch/storage/prefetch/TieredStoragePrefetchSettingsTests.java

Sub-PR theme: Add StoredFieldsPrefetch listener and TieredStoragePlugin

Relevant files:

  • modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java
  • modules/tiered-storage/src/main/java/org/opensearch/storage/package-info.java
  • modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java
  • modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/package-info.java
  • modules/tiered-storage/src/test/java/org/opensearch/storage/prefetch/StoredFieldsPrefetchTests.java
  • CHANGELOG.md

⚡ Recommended focus areas for review

Possible Issue

In executePrefetch, when the reader changes to a new segment and the inner reader is not a SegmentReader, currentReader is set to null and the loop continues. However, currentReaderIndex is already updated to readerIndex, so subsequent docs in the same non-SegmentReader segment will skip the null check at the top of the loop (since currentReaderIndex == readerIndex) and proceed with currentReader == null, hitting the continue correctly. But currentReaderContext will also have been updated to the new context. This is actually fine for the null check path, but the assert currentReaderContext != null on line 82 could mask the real issue: if currentReaderContext is somehow null (e.g., first iteration), the assert is a no-op in production builds. More critically, readerIndex is used in the debug log on line 88 but is declared inside the try block scope — this is a scoping issue that may cause a compile error since readerIndex is referenced in the log statement outside its declaration scope.

int currentReaderIndex = -1;
LeafReaderContext currentReaderContext = null;
StoredFields currentReader = null;
log.debug("Stored Field Execute prefetch was triggered: {}", context.docIdsToLoadSize());
for (int index = 0; index < context.docIdsToLoadSize(); index++) {
    int docId = context.docIdsToLoad()[context.docIdsToLoadFrom() + index];
    try {
        int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
        if (currentReaderIndex != readerIndex) {
            currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
            currentReaderIndex = readerIndex;

            // Unwrap the reader here
            LeafReader innerLeafReader = currentReaderContext.reader();
            while (innerLeafReader instanceof FilterLeafReader) {
                innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
            }
            // never be the case, just sanity check
            if (!(innerLeafReader instanceof SegmentReader)) {
                // disable prefetch on stored fields for this segment
                log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
                currentReader = null;
                continue;
            }
            currentReader = innerLeafReader.storedFields();
        }
        assert currentReaderContext != null;
        if (currentReader == null) {
            continue;
        }
        log.debug(
            "Prefetching stored fields for index shard: {}, docId: {}, readerIndex: {}",
            context.indexShard().shardId(),
            docId,
            readerIndex
        );

        // nested docs logic
        final int subDocId = docId - currentReaderContext.docBase;
        final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
        if (rootDocId != -1) {
            currentReader.prefetch(rootDocId);
        }
        currentReader.prefetch(subDocId);
    } catch (Exception e) {
        log.warn("Failed to prefetch stored fields for docId: " + docId, e);
    }
}
Incorrect Prefetch Logic

In findRootDocumentIfNested, when bits.get(subDocId) is false (meaning the doc is a nested/child doc), bits.nextSetBit(subDocId) returns the next set bit at or after subDocId. This could return subDocId itself if the bit is set, but the condition already checked it's not set, so it returns the next parent doc. However, nextSetBit may return NO_MORE_DOCS (Integer.MAX_VALUE) if there is no parent doc after subDocId, which would then be passed to currentReader.prefetch() causing an invalid document access. There is no upper-bound check on the return value.

private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException {
    if (context.mapperService().hasNested()) {
        BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext);
        if (bits != null && !bits.get(subDocId)) {
            return bits.nextSetBit(subDocId);
        }
    }
    return -1;
}
Null Dereference Risk

getPrefetchSettingsSupplier() returns a supplier that accesses this.tieredStoragePrefetchSettings, which is only initialized in createComponents(). If onIndexModule() is called before createComponents() (e.g., during plugin initialization ordering), the supplier will return null. StoredFieldsPrefetch.checkIfStoredFieldsPrefetchEnabled() does handle a null settings object gracefully, but this silent null return could mask initialization ordering bugs.

public Supplier<TieredStoragePrefetchSettings> getPrefetchSettingsSupplier() {
    return () -> this.tieredStoragePrefetchSettings;
}

// @Override
// public List<ActionHandler<? extends ActionRequest, ? extends ActionResponse>> getActions() {
// if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
// return List.of(
// new ActionHandler<>(HotToWarmTierAction.INSTANCE, TransportHotToWarmTierAction.class),
// new ActionHandler<>(WarmToHotTierAction.INSTANCE, TransportWarmToHotTierAction.class),
// new ActionHandler<>(CancelTieringAction.INSTANCE, TransportCancelTierAction.class),
// new ActionHandler<>(ListTieringStatusAction.INSTANCE, TransportListTieringStatusAction.class),
// new ActionHandler<>(GetTieringStatusAction.INSTANCE, TransportGetTieringStatusAction.class));
// } else {
// return List.of();
// }
// }

// @Override
// public List<RestHandler> getRestHandlers(
// Settings settings, RestController restController,
// ClusterSettings clusterSettings,
// IndexScopedSettings indexScopedSettings, SettingsFilter settingsFilter,
// IndexNameExpressionResolver indexNameExpressionResolver,
// Supplier<DiscoveryNodes> nodesInCluster
// ) {
// if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
// return List.of(
// new RestHotToWarmTierAction(),
// new RestWarmToHotTierAction(),
// new RestCancelTierAction(),
// new RestGetTieringStatusAction(),
// new RestListTieringStatusAction());
// } else {
// return List.of();
// }
// }

@Override
public Collection<Object> createComponents(
    Client client,
    ClusterService clusterService,
    ThreadPool threadPool,
    ResourceWatcherService resourceWatcherService,
    ScriptService scriptService,
    NamedXContentRegistry xContentRegistry,
    Environment environment,
    NodeEnvironment nodeEnvironment,
    NamedWriteableRegistry namedWriteableRegistry,
    IndexNameExpressionResolver indexNameExpressionResolver,
    Supplier<RepositoriesService> repositoriesServiceSupplier
    // Tracer tracer,
    // MetricsRegistry metricsRegistry
) {
    // this.tierActionMetrics = new TierActionMetrics(metricsRegistry);
    this.tieredStoragePrefetchSettings = new TieredStoragePrefetchSettings(clusterService.getClusterSettings());
    // this.threadpool.set(threadPool);
    return Collections.emptyList();
}

@Override
public void onIndexModule(IndexModule indexModule) {
    if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
        // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings()));
        indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier()));
    }
}

@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from eb42398 to a3c9bdd Compare March 23, 2026 09:25
@github-actions
Copy link
Contributor

github-actions bot commented Mar 23, 2026

PR Code Suggestions ✨

Latest suggestions up to f0959d8

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Handle IOException from storedFields() call

innerLeafReader.storedFields() can throw an IOException, but it is called outside
the try-catch block that wraps the rest of the per-document logic. This means an
IOException from storedFields() would propagate uncaught and abort the entire
prefetch loop. Move the storedFields() call inside the try-catch block or handle the
exception explicitly.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81]

 int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
 if (currentReaderIndex != readerIndex) {
     currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
     currentReaderIndex = readerIndex;
 
     // Unwrap the reader here
     LeafReader innerLeafReader = currentReaderContext.reader();
     while (innerLeafReader instanceof FilterLeafReader) {
         innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
     }
     // never be the case, just sanity check
     if (!(innerLeafReader instanceof SegmentReader)) {
         // disable prefetch on stored fields for this segment
         log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
         currentReader = null;
         continue;
     }
-    currentReader = innerLeafReader.storedFields();
+    try {
+        currentReader = innerLeafReader.storedFields();
+    } catch (IOException e) {
+        log.warn("Failed to get storedFields for segment, skipping prefetch", e);
+        currentReader = null;
+        continue;
+    }
 }
Suggestion importance[1-10]: 7

__

Why: The storedFields() call at line 80 can throw IOException but is outside the try-catch block, meaning an exception would abort the entire prefetch loop. This is a real robustness issue that should be addressed.

Medium
General
Avoid double-prefetching nested and root documents

When a nested document is found, the root document is prefetched first, but the
sub-document is always prefetched regardless. If subDocId is itself the root
document (i.e., bits.get(subDocId) is true in findRootDocumentIfNested), this is
fine, but if rootDocId != -1 and rootDocId == subDocId, the same document is
prefetched twice. More critically, if the sub-document is a nested child,
prefetching it separately may be unnecessary or wasteful. Consider only prefetching
subDocId when it is not a nested child (i.e., rootDocId == -1).

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [95-99]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about potentially redundant prefetching when a nested document's root is found. However, the original code's intent may be to prefetch both the root and the child document for completeness, and the behavior difference is subtle. The improvement is reasonable but not critical.

Low
Guard against uninitialized settings in onIndexModule

onIndexModule is called once per index, so a new StoredFieldsPrefetch instance is
created for every index. Since StoredFieldsPrefetch is stateless (it only holds a
supplier reference), this is not a bug per se, but it is wasteful. More importantly,
if tieredStoragePrefetchSettings has not yet been initialized (i.e.,
createComponents has not been called), the supplier will return null, and
checkIfStoredFieldsPrefetchEnabled will silently return false rather than failing
visibly. Consider adding a guard or assertion to ensure
tieredStoragePrefetchSettings is initialized before onIndexModule is invoked.

modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [167-172]

 public void onIndexModule(IndexModule indexModule) {
     if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
-        // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings()));
+        if (tieredStoragePrefetchSettings == null) {
+            throw new IllegalStateException("TieredStoragePrefetchSettings not initialized; createComponents must be called before onIndexModule");
+        }
         indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier()));
     }
 }
Suggestion importance[1-10]: 4

__

Why: While the concern about tieredStoragePrefetchSettings being null is valid, the checkIfStoredFieldsPrefetchEnabled method already handles the null case gracefully by returning false. Throwing an IllegalStateException would be a breaking change in behavior and may not be appropriate for a plugin lifecycle scenario.

Low

Previous suggestions

Suggestions up to commit 4e87c8d
CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid double-prefetching nested document stored fields

When a document is nested, subDocId is a child document and rootDocId is the parent.
Prefetching subDocId after rootDocId may be redundant or incorrect — nested child
documents typically don't have their own stored fields entry separate from the root.
You should only prefetch the root document when nesting is detected, and skip
prefetching subDocId in that case to avoid unnecessary or erroneous I/O.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [95-99]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 7

__

Why: The current code prefetches both rootDocId and subDocId when nesting is detected, but nested child documents share stored fields with the root document. Prefetching subDocId after rootDocId is redundant and potentially incorrect. The fix to use else is logically sound and avoids unnecessary I/O.

Medium
Handle IOException from storedFields() call safely

innerLeafReader.storedFields() can throw an IOException, but it is called outside
the try/catch block that wraps the rest of the per-document logic. This means an
exception here would propagate uncaught and abort the entire prefetch loop. Move the
storedFields() call inside the try/catch block or handle the exception explicitly.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81]

 int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
 if (currentReaderIndex != readerIndex) {
     currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
     currentReaderIndex = readerIndex;
 
     // Unwrap the reader here
     LeafReader innerLeafReader = currentReaderContext.reader();
     while (innerLeafReader instanceof FilterLeafReader) {
         innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
     }
     // never be the case, just sanity check
     if (!(innerLeafReader instanceof SegmentReader)) {
         // disable prefetch on stored fields for this segment
         log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
         currentReader = null;
         continue;
     }
-    currentReader = innerLeafReader.storedFields();
+    try {
+        currentReader = innerLeafReader.storedFields();
+    } catch (IOException e) {
+        log.warn("Failed to get storedFields for segment, skipping prefetch", e);
+        currentReader = null;
+        continue;
+    }
 }
Suggestion importance[1-10]: 6

__

Why: The innerLeafReader.storedFields() call at line 80 can throw IOException but is inside the outer try/catch(Exception e) block that wraps the entire per-document logic (lines 62-102), so it is already handled. However, the suggestion to handle it more specifically with a continue is a minor improvement for clarity and resilience.

Low
General
Reuse single listener instance across all index modules

onIndexModule is called once per index, potentially creating a new
StoredFieldsPrefetch instance for every index. Since StoredFieldsPrefetch is
stateless beyond its settings supplier, a single shared instance should be created
(e.g., in createComponents) and reused across all index modules to avoid unnecessary
object allocation.

modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [167-172]

+// In createComponents, add:
+// this.storedFieldsPrefetch = new StoredFieldsPrefetch(getPrefetchSettingsSupplier());
+
+@Override
 public void onIndexModule(IndexModule indexModule) {
     if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
-        // indexModule.addSearchOperationListener(new TieredStorageSearchSlowLog(indexModule.getIndexSettings()));
-        indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier()));
+        indexModule.addSearchOperationListener(storedFieldsPrefetch);
     }
 }
Suggestion importance[1-10]: 4

__

Why: Creating a new StoredFieldsPrefetch per index module is wasteful since the class is stateless beyond its settings supplier. Reusing a single instance is a valid optimization, though the performance impact is minor. The improved_code uses a comment placeholder rather than showing the full implementation, making it less actionable.

Low
Suggestions up to commit b00a2bc
CategorySuggestion                                                                                                                                    Impact
Possible issue
Handle IOException from storedFields() acquisition

innerLeafReader.storedFields() can throw an IOException, but it is called outside
the try-catch block that wraps the rest of the per-document logic. This means an
IOException from storedFields() would propagate uncaught and abort the entire
prefetch loop. Move the storedFields() call inside the try-catch block or handle the
exception explicitly.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81]

 int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
 if (currentReaderIndex != readerIndex) {
     currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
     currentReaderIndex = readerIndex;
 
     // Unwrap the reader here
     LeafReader innerLeafReader = currentReaderContext.reader();
     while (innerLeafReader instanceof FilterLeafReader) {
         innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
     }
     // never be the case, just sanity check
     if (!(innerLeafReader instanceof SegmentReader)) {
         // disable prefetch on stored fields for this segment
         log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
         currentReader = null;
         continue;
     }
-    currentReader = innerLeafReader.storedFields();
+    try {
+        currentReader = innerLeafReader.storedFields();
+    } catch (IOException e) {
+        log.warn("Failed to get storedFields for segment, skipping prefetch", e);
+        currentReader = null;
+        continue;
+    }
 }
Suggestion importance[1-10]: 7

__

Why: The innerLeafReader.storedFields() call can throw IOException but is currently outside the try-catch block, meaning an exception would propagate uncaught and abort the entire prefetch loop. Wrapping it in a try-catch ensures robustness and consistent error handling.

Medium
Avoid redundant prefetch for nested child documents

When a document is nested, subDocId is a child document and rootDocId is the parent.
Prefetching subDocId after rootDocId may be redundant or incorrect — nested child
documents are typically stored within the root document's stored fields block. You
should only prefetch the root document when nesting is detected, and skip
prefetching the child subDocId separately to avoid unnecessary or duplicate I/O.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [97-101]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 6

__

Why: When a document is nested, prefetching both rootDocId and subDocId may result in redundant I/O since nested child documents are stored within the root document's block. Using an else branch to only prefetch subDocId when there's no nesting is a reasonable optimization, though the actual behavior depends on the Lucene storage implementation.

Low
General
Reuse existing supplier method to avoid code duplication

The supplier passed to StoredFieldsPrefetch can return null if
tieredStoragePrefetchSettings is not yet initialized. However,
getPrefetchSettingsSupplier() already handles this case with the same null check.
Using getPrefetchSettingsSupplier() here would be more consistent and avoid
duplicating the null-check logic inline.

modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [170-175]

-indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> {
-    if (this.tieredStoragePrefetchSettings == null) {
-        return null;
-    }
-    return this.tieredStoragePrefetchSettings;
-}));
+indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier()));
Suggestion importance[1-10]: 5

__

Why: The inline lambda duplicates the null-check logic already present in getPrefetchSettingsSupplier(). Using the existing method improves consistency and reduces code duplication, though the impact is minor.

Low
Suggestions up to commit a6eb66a
CategorySuggestion                                                                                                                                    Impact
Possible issue
Handle IOException from storedFields() acquisition

innerLeafReader.storedFields() can throw an IOException, but it is called outside
the try/catch block that wraps the rest of the per-document logic. This means an
IOException from storedFields() would propagate uncaught and abort the entire
prefetch loop. Move the storedFields() call inside the try/catch block or handle the
exception explicitly.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-81]

 int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
 if (currentReaderIndex != readerIndex) {
     currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
     currentReaderIndex = readerIndex;
 
     // Unwrap the reader here
     LeafReader innerLeafReader = currentReaderContext.reader();
     while (innerLeafReader instanceof FilterLeafReader) {
         innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
     }
     // never be the case, just sanity check
     if (!(innerLeafReader instanceof SegmentReader)) {
         // disable prefetch on stored fields for this segment
         log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
         currentReader = null;
         continue;
     }
-    currentReader = innerLeafReader.storedFields();
+    try {
+        currentReader = innerLeafReader.storedFields();
+    } catch (IOException e) {
+        log.warn("Failed to get storedFields for segment, skipping prefetch", e);
+        currentReader = null;
+        continue;
+    }
 }
Suggestion importance[1-10]: 7

__

Why: The innerLeafReader.storedFields() call can throw IOException but is currently outside the try/catch block, meaning an exception would abort the entire prefetch loop. Moving it inside the catch block prevents this failure mode and improves robustness.

Medium
General
Avoid redundant prefetch of nested child documents

When a document is nested, subDocId is a child document and rootDocId is the parent.
Prefetching subDocId when it is a nested (non-root) document may be unnecessary or
incorrect, since the stored fields of interest are on the root document. Consider
only prefetching subDocId when it is not a nested child (i.e., when rootDocId ==
-1).

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [97-101]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 6

__

Why: When a document is nested, prefetching both the root and the child subDocId may be redundant or incorrect since stored fields are on the root document. The suggested change to only prefetch subDocId when it's not a nested child is a reasonable optimization, though the impact depends on the actual data model.

Low
Restrict prefetch listener to tiered-storage indices only

The StoredFieldsPrefetch listener is registered for every index module when the
feature flag is enabled, regardless of whether the index is actually a
tiered-storage index. This could cause unnecessary overhead on all indices. Consider
adding a check on the index settings or index type before registering the listener.

modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [170-175]

-indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> {
-    if (this.tieredStoragePrefetchSettings == null) {
-        return null;
-    }
-    return this.tieredStoragePrefetchSettings;
-}));
+if (indexModule.getIndexSettings().getSettings().get("index.store.type", "").contains("tiered")) {
+    indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> {
+        if (this.tieredStoragePrefetchSettings == null) {
+            return null;
+        }
+        return this.tieredStoragePrefetchSettings;
+    }));
+}
Suggestion importance[1-10]: 4

__

Why: The suggestion to restrict the listener to tiered-storage indices is a valid design concern, but the improved_code uses a hardcoded string check on index.store.type that may not be the correct mechanism in this codebase. The approach is speculative and may not reflect the actual index type detection pattern used in OpenSearch.

Low
Suggestions up to commit 799eb95
CategorySuggestion                                                                                                                                    Impact
Possible issue
Skip only affected segment on unexpected reader type

When an unexpected (non-SegmentReader) reader type is encountered, the method
returns immediately, skipping all remaining documents across all segments — not just
the current segment. This is overly aggressive; the issue may be isolated to one
segment. Consider using continue to skip only the current segment and attempt
prefetching for subsequent segments.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [63-80]

 int readerIndex = ReaderUtil.subIndex(docId, context.searcher().getIndexReader().leaves());
 if (currentReaderIndex != readerIndex) {
     currentReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
     currentReaderIndex = readerIndex;
 
     // Unwrap the reader here
     LeafReader innerLeafReader = currentReaderContext.reader();
     while (innerLeafReader instanceof FilterLeafReader) {
         innerLeafReader = ((FilterLeafReader) innerLeafReader).getDelegate();
     }
     // never be the case, just sanity check
     if (!(innerLeafReader instanceof SegmentReader)) {
         // disable prefetch on stored fields for this case
         log.warn("Unexpected reader type [{}], skipping stored fields prefetch", innerLeafReader.getClass().getName());
-        return;
+        currentReader = null;
+        continue;
     }
     currentReader = innerLeafReader.storedFields();
 }
+if (currentReader == null) {
+    continue;
+}
Suggestion importance[1-10]: 7

__

Why: The current return statement aborts prefetching for all remaining documents when a single segment has an unexpected reader type. Using continue with a null guard is a more targeted fix that allows other segments to still be prefetched, improving robustness.

Medium
Avoid duplicate prefetch for nested documents

When a document is nested, subDocId is a child document and rootDocId is the parent.
Prefetching subDocId after rootDocId may be redundant or incorrect since nested
child documents' stored fields are typically stored with the root document. More
critically, if rootDocId == subDocId (i.e., the doc is already the root), both
prefetch calls will be made unnecessarily. Consider skipping the subDocId prefetch
when a valid rootDocId is found.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [93-97]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that when a nested root document is found, prefetching subDocId (the child) is likely redundant since stored fields are typically stored with the root. Using else avoids the unnecessary second prefetch call, improving efficiency.

Low
General
Handle missing root document for nested docs

BitSet.nextSetBit(subDocId) may return -1 if no set bit exists at or after subDocId,
which would be misinterpreted by the caller as "no root document found" — but -1 is
also the sentinel value for "not nested". While unlikely in practice, if nextSetBit
returns -1 (no parent found), the caller would skip the prefetch silently. Add a
guard to handle this edge case explicitly.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [104-112]

 private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException {
     if (context.mapperService().hasNested()) {
         BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext);
         if (bits != null && !bits.get(subDocId)) {
-            return bits.nextSetBit(subDocId);
+            int rootDocId = bits.nextSetBit(subDocId);
+            if (rootDocId != -1) {
+                return rootDocId;
+            }
+            log.warn("Could not find root document for nested docId: {}", subDocId);
         }
     }
     return -1;
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly notes that nextSetBit can return -1, but in practice this is already handled correctly since -1 is the sentinel value and the caller checks rootDocId != -1. The added warning log provides minor diagnostic value but the functional change is minimal.

Low
Suggestions up to commit a3c9bdd
CategorySuggestion                                                                                                                                    Impact
Possible issue
Add null check before accessing BitSet

The bits variable returned by getBitSet can be null if no documents match the filter
for the given leaf context. Calling bits.get(subDocId) without a null check will
throw a NullPointerException at runtime. Add a null guard before accessing the
BitSet.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [104-112]

 private int findRootDocumentIfNested(SearchContext context, LeafReaderContext subReaderContext, int subDocId) throws IOException {
     if (context.mapperService().hasNested()) {
         BitSet bits = context.bitsetFilterCache().getBitSetProducer(Queries.newNonNestedFilter()).getBitSet(subReaderContext);
-        if (!bits.get(subDocId)) {
+        if (bits != null && !bits.get(subDocId)) {
             return bits.nextSetBit(subDocId);
         }
     }
     return -1;
 }
Suggestion importance[1-10]: 8

__

Why: The getBitSet method can return null when no documents match the filter for a given leaf context, and calling bits.get(subDocId) without a null check would cause a NullPointerException. This is a real bug that needs to be fixed.

Medium
Guard against null settings before component initialization

getPrefetchSettingsSupplier() returns a supplier that accesses
this.tieredStoragePrefetchSettings, which is only initialized in createComponents().
If onIndexModule is called before createComponents(), the supplier will return null,
causing a NullPointerException when isStoredFieldsPrefetchEnabled() is invoked. Add
a null check inside the supplier or ensure initialization order is guaranteed.

modules/tiered-storage/src/main/java/org/opensearch/storage/TieredStoragePlugin.java [166-172]

 @Override
 public void onIndexModule(IndexModule indexModule) {
     if (FeatureFlags.isEnabled(FeatureFlags.WRITABLE_WARM_INDEX_EXPERIMENTAL_FLAG)) {
-        indexModule.addSearchOperationListener(new StoredFieldsPrefetch(getPrefetchSettingsSupplier()));
+        indexModule.addSearchOperationListener(new StoredFieldsPrefetch(() -> {
+            if (this.tieredStoragePrefetchSettings == null) {
+                return null;
+            }
+            return this.tieredStoragePrefetchSettings;
+        }));
     }
 }
Suggestion importance[1-10]: 7

__

Why: tieredStoragePrefetchSettings is only initialized in createComponents(), and if onIndexModule is called before that, the supplier returns null, causing a NullPointerException. The null guard in the improved code addresses this lifecycle issue, though the StoredFieldsPrefetch class would also need to handle a null settings return.

Medium
General
Handle prefetch errors gracefully without aborting

Throwing an exception on the first prefetch failure aborts prefetching for all
remaining documents. Since prefetching is an optimization (not
correctness-critical), it would be more resilient to log the error and continue
processing the remaining documents rather than propagating the exception and
potentially failing the entire fetch phase.

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [98-100]

 } catch (Exception e) {
-    throw ExceptionsHelper.convertToOpenSearchException(e);
+    log.warn("Failed to prefetch stored fields for docId: " + docId, e);
 }
Suggestion importance[1-10]: 6

__

Why: Since prefetching is an optimization, throwing an exception on failure aborts the entire fetch phase unnecessarily. Logging and continuing would be more resilient, though this is a design decision that may be intentional.

Low
Avoid double-prefetching nested and root documents

When a document is a root document (not nested), findRootDocumentIfNested returns -1
and subDocId is prefetched correctly. However, when a document IS nested, both
rootDocId and subDocId are prefetched. But subDocId itself may be a nested (child)
document that doesn't have stored fields — prefetching it could be wasteful or
erroneous. Consider only prefetching subDocId when it is not a nested child (i.e.,
when rootDocId == -1).

modules/tiered-storage/src/main/java/org/opensearch/storage/prefetch/StoredFieldsPrefetch.java [93-97]

 final int rootDocId = findRootDocumentIfNested(context, currentReaderContext, subDocId);
 if (rootDocId != -1) {
     currentReader.prefetch(rootDocId);
+} else {
+    currentReader.prefetch(subDocId);
 }
-currentReader.prefetch(subDocId);
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about prefetching nested child documents that may not have stored fields, but the behavior may be intentional. The improved code changes the logic significantly and could affect correctness depending on the use case.

Low

@github-actions
Copy link
Contributor

Persistent review updated to latest commit a3c9bdd

@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from a3c9bdd to 799eb95 Compare March 23, 2026 10:02
@github-actions
Copy link
Contributor

Persistent review updated to latest commit 799eb95

@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from 799eb95 to a6eb66a Compare March 23, 2026 10:20
@github-actions
Copy link
Contributor

Persistent review updated to latest commit a6eb66a

@github-actions
Copy link
Contributor

❌ Gradle check result for a6eb66a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from a6eb66a to b00a2bc Compare March 23, 2026 11:25
@github-actions
Copy link
Contributor

Persistent review updated to latest commit b00a2bc

@github-actions
Copy link
Contributor

❌ Gradle check result for b00a2bc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from b00a2bc to 4e87c8d Compare March 23, 2026 12:12
@github-actions
Copy link
Contributor

Persistent review updated to latest commit 4e87c8d

@github-actions
Copy link
Contributor

❌ Gradle check result for 4e87c8d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kavya Aggarwal <kavyaagg@amazon.com>
@GeekGlider GeekGlider force-pushed the feature/tiered-storage-module branch from 4e87c8d to f0959d8 Compare March 23, 2026 13:05
@github-actions
Copy link
Contributor

Persistent review updated to latest commit f0959d8

@github-actions
Copy link
Contributor

❌ Gradle check result for f0959d8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei
Copy link
Contributor

kkewwei commented Mar 23, 2026

@GeekGlider Same with PR #20176.

Could you explain why a new module is required to implement this functionality? If the requirement is simply to add prefetching for stored fields, creating a separate module may not be appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants